Evaluation Resources for Concept-based Cross-Lingual Information Retrieval in the Medical Domain

نویسندگان

  • Paul Buitelaar
  • Diana Steffen
  • Martin Volk
  • Dominic Widdows
  • Bogdan Sacaleanu
  • Spela Vintar
  • Stanley Peters
  • Hans Uszkoreit
چکیده

The paper describes evaluation resources for concept-based, cross-lingual information retrieval in the medical domain. All resources were constructed in the context of the MuchMore project and are freely available through the project website. Available resources include: a bilingual, parallel document collection of German and English medical scientific abstracts, a set of queries and corresponding relevance assessments, two manually disambiguated test sets for semantic annotation (sense disambiguation), two evaluation lists for German morphological decomposition of medical terms.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

KI-Zeitschrift - Auszug als Leseprobe

MuchMore provides a framework for integrating and refining existing technologies and developing new approaches to cross-lingual information retrieval (CLIR) for the medical domain. Existence of very large ontologies of domain concepts and extensive corpora for the medical domain has grounded the work toward refinement, integration and comparison of concept-based retrieval methods and corpora-ba...

متن کامل

A Systematic Evaluation of Concept-based Cross-Lingual Information Retrieval in the Medical Domain

The paper describes experiments and results of the MuchMore project1, which is concerned with a systematic comparison of concept-based and corpus-based methods in cross-language information retrieval (CLIR) in the medical domain. Primary goals of the project are to develop and evaluate methods for the effective use of multilingual thesauri in the semantic annotation of English and German medica...

متن کامل

Cross-Lingual Medical Information Retrieval through Semantic Annotation

We present a framework for concept-based, cross-lingual information retrieval (CLIR) in the medical domain, which is under development in the MUCHMORE project. Our approach is based on using the Unified Medical Language System (UMLS) as the primary source of semantic data, whereby documents and queries are annotated with multiple layers of linguistic information. Linguistic processing includes ...

متن کامل

Generating Cross-lingual Concept Space from Parallel Corpora on the Web

The information available in languages other than English on the World Wide Web is increasing significantly. To cross language boundaries between different languages, dictionaries are the most typical tools. However, the general-purpose dictionary is less sensitive in genre and domain and it is impractical to manually construct tailored bilingual dictionaries or sophisticated multilingual thesa...

متن کامل

Multilingual Test Sets for Machine Translation of Search Queries for Cross-Lingual Information Retrieval in the Medical Domain

This paper presents development and test sets for machine translation of search queries in cross-lingual information retrieval in the medical domain. The data consists of the total of 1,508 real user queries in English translated to Czech, German, and French. We describe the translation and review process involving medical professionals and present a baseline experiment where our data sets are ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2004